Use the Subgroup 2D Block Encoding in LoadStoreOpToLLVM #4500
This PR marks loads with Subgroup 2D Block Encoding as expensive loads so that their layout is preserved. LoadStoreOpToLLVM::rewriteTensorPointer now supports two kinds of loads: those with the block IO tag and a DPAS layout, and those with Subgroup 2D Block Encoding. Both are handled similarly: the loaded values in registers are implicitly converted to the DPAS layout via register shuffles. The ConvertLayout op that remained in the Subgroup 2D Block Encoding case is deleted.
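As a rough illustration of the two conditions described above, here is a minimal, self-contained sketch. It is not the code in this PR: `Encoding`, `LoadInfo`, `LoweringPath`, `selectLoweringPath`, and `isExpensiveLoad` are hypothetical stand-ins for the real MLIR attributes and helpers, and the sketch assumes the checks reduce to simple predicates on the load's encoding and tag.

```cpp
// Hypothetical stand-ins for the encodings a load result may carry.
// These are NOT the real MLIR attribute classes.
enum class Encoding { Blocked, Dpas, Subgroup2DBlock };

// Hypothetical summary of the properties of a load that matter here.
struct LoadInfo {
  Encoding encoding;
  bool hasBlockIOTag; // assumed flag standing in for the block IO tag
};

enum class LoweringPath { Generic, BlockIO2D };

// A load takes the 2D block path if it is either tagged for block IO and
// has a DPAS layout, or carries the Subgroup 2D Block Encoding.
LoweringPath selectLoweringPath(const LoadInfo &load) {
  const bool taggedDpas =
      load.hasBlockIOTag && load.encoding == Encoding::Dpas;
  const bool subgroup2D = load.encoding == Encoding::Subgroup2DBlock;
  return (taggedDpas || subgroup2D) ? LoweringPath::BlockIO2D
                                    : LoweringPath::Generic;
}

// Only the condition this PR adds: loads with Subgroup 2D Block Encoding
// count as expensive, so their layout is preserved. Any other criteria in
// the real check are omitted from this sketch.
bool isExpensiveLoad(const LoadInfo &load) {
  return load.encoding == Encoding::Subgroup2DBlock;
}
```

In this sketch, a load with the Subgroup 2D Block Encoding reaches the block path without needing the tag, which follows the description above literally; the actual lowering may impose further requirements.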
One issue remains: I seem to be inserting a barrier somewhere, which causes a performance regression. I need to determine why that is happening; after that, this should be ready for review.
close #4499
depends on #4463
depends on #4510